Substitute Specification (Without Markings)

Caption MP3 Player 

CROSS REFERENCE TO RELATED APPLICATION(S) 

This application claims priority to international application number
WO/2000/41 175 filed under the Patent Cooperation Treaty, which claims priority to
Korean Patent Application No. 199/235, filed on January 8, 1999, which are hereby
incorporated herein by reference.

BACKGROUND OF THE INVENTION 

[Field of the Invention] 

The present invention relates to a caption MP3 player having a caption display
function, a caption MP3 data format, and a method of reproducing caption MP3 data, and in
particular, to a caption MP3 player having a function of displaying caption information on
a display device in synchronism with corresponding audio information while outputting
the audio information in stereo, a caption MP3 data format, and a method of reproducing
the caption MP3 data.

[Description of the Related Art] 

Generally, MP3 means MPEG (Moving Picture Experts Group) Layer-3, and
belongs to the audio technology in the MPEG field. MP3 is an audio file format
formed by compressing existing audio data by audio data coding with little perceptible
deterioration in sound quality. Such an MP3 file has an AAU (hereinafter, "audio
decoding unit") recording format. In other words, the MP3 file comprises a header, a
cyclic redundancy check (CRC), audio information, and auxiliary data. Usually, the MP3
player playing MP3 files is used as a dedicated audio appliance for receiving compressed
audio files and reproducing them in the form of audio information.

The conventional MP3 player mounts a liquid crystal display on which, in
addition to the audio information being reproduced, simple character data (e.g., a simple
reference such as a title of a song) is displayed. However, such a player cannot display
characters, i.e., caption information, in synchronism with the corresponding audio
information.

In the conventional caption tape system using a cassette tape, the caption information
and the audio information are stored respectively on the two tracks of a tape that is
otherwise used exclusively for audio. The audio information and the caption information
are outputted to the speaker and the liquid crystal display, respectively, by the caption
cassette player.

However, the caption tape system, in which the digital signal of the character data is
converted into an analog signal and stored on the tape, has some problems when
reproducing the data. The problems are that: the character signal causes noise by
interfering with the audio signal; the audio signal causes errors in the characters by
interfering with the character signal; or the audio information is outputted in mono, not
stereo, because the character data is stored in one of the tracks on the tape.



To solve the problem of mono output in the caption cassette, the tape is divided
into four tracks or output is achieved in the stereophonic mode by signal synthesis.
However, when four tracks are used, the player must comprise a four-track head to
process the data on each track. When the data is outputted in the stereophonic mode by
signal synthesis, signal loss may occur during the analysis of the signal since it is not
possible to completely separate the synthesized signals. Moreover, the two signals cause
noise by interfering with each other when the audio information is reproduced.



SUMMARY OF THE INVENTION 



An object of the present invention is to provide an MP3 player having a caption
information display function of storing audio information and corresponding caption
information in an MP3 recording medium and reproducing the recorded data by
synchronizing the data with each other, and to provide an MP3 data format and a method of
reproducing the MP3 data.

An MP3 file according to the present application comprises standard MP3 audio
information and caption information having data to display the audio information in the
form of characters, and thus is referred to hereinafter as caption MP3 data or a caption
MP3 file.

In the reproduction of the caption MP3 file, the audio information to be
reproduced and the corresponding caption information should be synchronously
outputted. For the synchronization of the audio information and the caption information,
position data and time data can be used, either together or individually. The position
data may be that of the audio information that should be synchronized with the caption
information or that of the caption information that should be synchronized with the audio
information. The time data indicates the display time of the caption data that should be
outputted through a display device.

The caption MP3 player according to the present invention reproduces the caption
MP3 data comprising audio information and corresponding caption information (the
caption information including position data and/or time data), the audio information
having a standard MP3 file format comprising a header, audio data and auxiliary data, the
caption MP3 player comprising: a storage means for storing the audio information and the
corresponding caption information inputted thereto; a signal separating means for
separating the audio information and caption information inputted from the storage
means; a control means for controlling the storage and output of the information through
the storage means and controlling the audio signal and the corresponding caption signal,
which are separated by the signal separating means, to be synchronized; an audio output
means for outputting the audio signal separated by the signal separating means; and a
caption output means for receiving the output from the signal separating means and
outputting the caption signal in synchronism with the corresponding audio signal
outputted from the audio output means.

The caption MP3 data format according to a first embodiment of the present
invention comprises audio information and corresponding caption information, the audio
information having a standard MP3 file format with a header, audio data and auxiliary
data, wherein the caption information includes position data and/or time data, and when
the audio information is reproduced, the caption information synchronized with the
reproduced audio information is outputted using the position data and/or the time data.

The caption MP3 data format according to a second embodiment of the present
invention has a format comprising a plurality of caption MP3 files, each of the caption
MP3 files having audio information and corresponding caption information, the audio
information having a standard MP3 file format with a header, audio data and auxiliary
data; wherein the audio information is located before the caption information in each
MP3 file, and the caption information includes caption display time data indicating the
time at which the caption is displayed on a display device when the caption information
is reproduced.

According to the present invention, the caption information can be included in the
standard MP3 information, in which the audio information is stored, and the caption
information can be provided along with the audio information by synchronously
outputting the caption information with the audio information using the position data
and/or time data.

The present invention has an advantage in that there is no noise caused by
interference between the caption information and the audio information since they are
separated. The present invention has a further advantage in that the caption information
is stored in the form of a digital signal so that the caption information can be stored in
various formats such as an image, hypertext, text, etc. and that the deterioration in
sound quality by repeated reproduction is prevented.

Further, there is an advantage in that the file format, in which the audio and the
caption are synchronized, makes searching and movement between intervals faster and
easier.

BRIEF DESCRIPTION OF THE DRAWINGS 

The drawings referenced herein form a part of the specification. Features shown
in the drawings are meant as illustrative of only some embodiments of the invention, and
not of all embodiments of the invention unless otherwise explicitly indicated.
Implications to the contrary are otherwise not to be made.

FIG. 1 is a block diagram of a caption MP3 player according to the present
invention.

FIG. 2 is a view illustrating a caption MP3 data format according to the first
embodiment of the present invention.

FIG. 3 is a flow chart showing a caption MP3 data reproduction method according 
to the first embodiment of the present invention. 

FIG. 4 is a view illustrating the caption MP3 data format according to the second
embodiment of the present invention.

FIG. 5 is a flow chart showing the method of reproducing caption information in 
the caption MP3 data according to the second embodiment of the present invention. 

DETAILED DESCRIPTION OF THE INVENTION

The construction and operation of the present invention will be explained in detail
with reference to the accompanying drawings.

FIG. 1 is a block diagram of an MP3 player having a caption display function
according to the present invention. The caption MP3 player according to the present
invention comprises an input section (11), a storage section (12) for storing audio
information and/or caption information, a signal separation section (13) for separating
the audio information and the caption information, a control section (10), an audio output
section (16), a caption output section (14) and a display section (15).



The audio information and the caption information are inputted through the input
section (11) from a recording medium (18) in which caption MP3 files are recorded. The
input section (11) is preferably an electric circuit comprising a connection port. The audio
information and the caption information inputted through the input section (11) are stored
in the storage section (12), preferably a memory cell. The signal separation section (13)
separates the audio information and the caption information stored in the storage section
(12).



The control section (10) controls the storage of information in the storage section
(12) or the output of information from the storage section (12), and synchronizes audio
signals and caption signals that are separated in the signal separation section (13). Also,
the control section (10) counts playing time while the audio information is being
reproduced.



The control section (10) is preferably a microcomputer having a function of
processing signals or a control circuit comprising a microcomputer, and is formed to be
controlled by a user.



The audio output section (16) receives the audio signal corresponding to the audio
information among the information from the signal separation section (13) and sends the
signal to a left speaker (L SPK) and a right speaker (R SPK) so that the signal is outputted
as audible sound. The caption output section (14) outputs the caption
signal corresponding to the caption information among the information from the signal
separation section (13) in synchronism with the audio signal outputted from the audio
output section (16).

The display section (15) displays the caption corresponding to the caption signal
outputted from the caption output section (14) in a visible form on the screen. Preferably,
the display section is a liquid crystal display (LCD).

In the caption MP3 player of the present invention having such a construction as
described above, when the audio information and the caption information are inputted
through the input section (11) from the caption MP3 file recorded in the recording
medium (18), the audio information and the caption information are stored in the storage
section (12) under the control of the control section (10). Each piece of information
stored in the storage section (12) is outputted from the storage section (12) through the
control section (10), which is controlled by the user, when the output of information is
required. The outputted information is separated into audio information and caption
information through the signal separation section (13), and the separated audio signal is
outputted to the speaker in the monophonic or stereophonic mode through the audio
output section (16). The separated caption signal is synchronized with the audio signal
and outputted on the display section (15) through the caption output section (14).

First Embodiment 

A caption MP3 data format according to the first embodiment of the present
invention will be described.

FIG. 2 is a view showing a caption MP3 data format according to the first embodiment
of the present invention. The caption MP3 information according to the present invention
consists of audio information (20) and caption information (22). The audio information
has a standard MP3 file format with a header, CRC, audio data and auxiliary data. The
header occupies a fixed 32-bit field containing information such as the layer,
sampling frequency and remaining frame. The CRC is optional, and whether it is present is
indicated in the header. The audio data is compressed data, and its length depends on the
kind of the data. The auxiliary data, as a user definition area, includes additional
information and is variable in size.
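
The header fields mentioned above can be illustrated with a minimal sketch. It assumes
the publicly documented MPEG-1 audio frame header layout (11-bit sync, 2-bit version,
2-bit layer, 1-bit protection flag, 4-bit bit rate index, 2-bit sampling frequency index,
1-bit padding flag); the names FrameHeader and parse_frame_header are hypothetical and do
not appear in this specification.

```python
from dataclasses import dataclass

# Sampling rates for MPEG-1 audio, indexed by the 2-bit sampling-frequency field.
MPEG1_SAMPLE_RATES = (44100, 48000, 32000)
# The 2-bit layer field is stored in reverse order: 0b01 means Layer 3.
LAYER_BY_BITS = {0b11: 1, 0b10: 2, 0b01: 3}


@dataclass
class FrameHeader:  # hypothetical container, for illustration only
    layer: int
    has_crc: bool
    sample_rate: int
    padding: bool


def parse_frame_header(raw: bytes) -> FrameHeader:
    """Decode a few fields of a 32-bit MPEG-1 audio frame header."""
    if len(raw) < 4:
        raise ValueError("a frame header is 4 bytes long")
    word = int.from_bytes(raw[:4], "big")
    if word >> 21 != 0x7FF:  # top 11 bits are the frame sync pattern
        raise ValueError("frame sync not found")
    if (word >> 19) & 0b11 != 0b11:  # this sketch only handles MPEG-1
        raise ValueError("only MPEG-1 headers are handled here")
    layer_bits = (word >> 17) & 0b11
    rate_index = (word >> 10) & 0b11
    if layer_bits not in LAYER_BY_BITS or rate_index == 0b11:
        raise ValueError("reserved field value")
    return FrameHeader(
        layer=LAYER_BY_BITS[layer_bits],
        has_crc=((word >> 16) & 1) == 0,  # protection bit 0 means a CRC follows
        sample_rate=MPEG1_SAMPLE_RATES[rate_index],
        padding=bool((word >> 9) & 1),
    )
```

In this sketch the optional 16-bit CRC is detected from the protection bit alone, which
matches the statement that the presence of the CRC depends on the header.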

Each piece of caption information (22) comprises a start synchronization code (1),
reproduction number data (2) and information data (8). It is not necessary to arrange these
elements in the order shown in FIG. 2. The arrangement shown in FIG. 2 is just an
example for description.

The start synchronization code (1) of the caption information (22) marks the
beginning of the caption information. The reproduction number data (2) is located after
the start synchronization code (1) and indicates to which audio information frame among
a plurality of pieces of audio information (20) the caption information corresponds. It
can be understood that the reproduction number is the position data of the audio
information with which the caption information should be synchronized or the position
data of the caption information with which the audio data should be synchronized.

The reproduction number data indicates the number that is used for reference
when the audio information (20) and the caption information (22) are reproduced, and
has a size of 4 bits, for example.

The information data (8) includes related information such as the address of the
data or the kind of data to be stored, and also includes, for example, reproduction address
data (3), an information identification code (4), a selection code (5) and caption data (6).

The reproduction address data (3) shows the reproduction number by which
pieces of caption information are combined with each other when a plurality of pieces of
caption information together form one word or picture, and has a size of 4 bits, for
example. To form a long paragraph, one or more pieces of the caption
information are required. The information identification code (4) shows the type of the
stored information file. The file can be in the form of an image file, hypertext file
(HTML) or text file, for example, which is adapted for the display device.

The selection code (5) indicates at least one of the language used in the
stored information, the operation time and the display mode of the display section (15).
The language used can be, for example, Korean (KOR), Japanese (JP), English (USA),
etc., and the operation time is the time at which the caption information should be
outputted. By using the operation time, the caption information is synchronized with the
audio information.

The display mode shows whether the caption in the form of a word or sentence is
outputted on the display section (15) in sequence or at once, and determines in what form
(for example, 20 columns and 4 lines or 24 columns and 2 lines) the characters should be
displayed.
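
As an illustration of such a display mode, the following sketch lays a caption out for a
display of a given number of columns and lines. The function name paginate and the use of
word wrapping are assumptions made for this example only; the specification does not
prescribe how characters are arranged within the display form.

```python
import textwrap
from typing import List


def paginate(text: str, columns: int = 20, lines: int = 4) -> List[List[str]]:
    """Split a caption into screens of at most `lines` rows of `columns` characters."""
    # textwrap.wrap breaks on whitespace and splits words longer than `columns`.
    rows = textwrap.wrap(text, width=columns)
    # Group consecutive rows into screens that fit the display.
    return [rows[i:i + lines] for i in range(0, len(rows), lines)]
```

Calling paginate(caption, 24, 2) would produce screens for the 24-column, 2-line form
instead. Each returned screen could be shown at once or row by row, depending on the
display mode.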

By using the operation time of the selection code (5), it is possible to output the
caption information in synchronism with the audio information. For such
synchronization, the operation time data and the reproduction number data (2) can be
used together or separately.
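
The fields described above can be sketched as a simple record, together with a lookup
that pairs an audio information frame with its caption by reproduction number. The class
and function names, the string values used for the information type, and the omission of
the 4-bit wrap-around of the reproduction number are all assumptions made for
illustration; they are not defined by this specification.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class CaptionInfo:  # hypothetical in-memory form of caption information (22)
    reproduction_number: int   # (2) audio frame this caption belongs to
    reproduction_address: int  # (3) links pieces that form one word or picture
    info_type: str             # (4) e.g. "text", "html" or "image" (assumed values)
    language: str              # part of the selection code (5)
    caption: str               # (6) the characters to display


def index_by_reproduction_number(
    captions: Iterable[CaptionInfo],
) -> Dict[int, CaptionInfo]:
    """Build a lookup table from reproduction number to caption information."""
    return {c.reproduction_number: c for c in captions}


def caption_for_frame(
    index: Dict[int, CaptionInfo], frame_number: int, current: str = ""
) -> str:
    """Return the caption for a frame, or keep the current one if none matches."""
    entry = index.get(frame_number)
    return entry.caption if entry is not None else current
```

A dictionary keyed by reproduction number keeps the lookup constant-time per frame; a
player with very little memory might instead scan the caption group sequentially.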

The caption data (6) shows the caption character outputted from the MP3
recording medium. The character stored at this time can be, for example, in the form of an
image, hypertext, text, etc.

When a caption information group (22a) comprising a plurality of pieces of caption
information (22) formed as such is added to an MP3 data format (20a) comprising several
pieces of audio information (20) of an audio decoding unit and is outputted from the
recording medium in which the MP3 audio information (20) is stored, the caption
character is outputted in synchronism with each audio signal extracted from a plurality
of pieces of audio information (20). The MP3 data group (20a) forms one MP3 file that,
for example, corresponds to a song. The caption data group (22a) includes the contents of
the caption corresponding to one MP3 data format (20a), and each of a plurality of pieces
of caption information (22) corresponds to one word or sentence that forms the contents
of the caption.

The recording medium may be an optical recording medium such as a compact disc,
an audio tape, a magnetic recording medium such as a hard disc, or a memory.

The plurality of pieces of audio information (20) included in the MP3 audio
information (20a) each have a 32-bit header, a 16-bit CRC, audio data and additional
data.

Reproduction of the information having such a caption MP3 data format will be 
explained with reference to the flow chart of FIG. 3. 

The recording medium is inserted into an apparatus reproducing the signal stored
in the recording medium, e.g., the MP3 player shown in FIG. 1, and the stored
information is reproduced under the control of the control circuit, including a
microcomputer, of the reproducing apparatus.

At least one of the audio information (20) and the caption information (22) is
stored in the MP3 recording medium (step 30) so that the synchronized audio signal and
caption signal are outputted for reproduction. The control circuit determines whether
only the audio information (20) exists in the information stored in the recording medium
(step 32).

At the determining step (32), when it is determined that the audio information
(20) exists in the recording medium without the caption information (22), only the audio
information (20) is outputted from the recording medium (step 34) because the
caption information (22) does not exist and thus cannot be outputted.

However, when it is determined that the caption information (22) exists in the
recording medium, the caption signal and the audio signal are reproduced in synchronism
with the caption information according to the existence or non-existence of the audio
information (20) in the recording medium, and then the reproduction of the next caption
signal or audio signal is repeated (steps 36-48).

As a first determining step, to reproduce all of the desired caption information and
the audio information during the above repetition, it should be determined first whether
the caption information (22) and the audio information (20) exist together in the
recording medium (step 36).

At the first step (36), if it is determined that the caption information (22) exists 
without the audio information (20), the caption information (22) only is outputted (step 
38). 

Meanwhile, if it is determined at the first step (36) that the caption information (22)
exists together with the audio information (20), the compressed audio information (20) is
decoded for reproduction (step 40).

As a second determining step, it is determined whether there is caption information
(22) corresponding to the audio information (20) decoded at the decoding step
(40), so that when the reproduction apparatus outputs audio, a corresponding caption
can be outputted (step 42). If there is no corresponding caption information, the output of
information is maintained by continuing to output the already outputted caption
information or caption information having a blank character (step 44).

However, if there is caption information (22) corresponding to the audio
information (20), the caption information is decoded according to the file form of the
caption information, for example, image, hypertext, text, etc. (step 46).

Then, the caption information is synchronized with the corresponding audio
information, which is decoded at the decoding step (40), in the reproduction apparatus
and outputted (step 48), and the process returns to the first determining step (36) to
output the next caption information.
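
The flow of steps 32 to 48 can be summarized in a short sketch. It assumes, purely for
illustration, that the audio information is available as a list of frames, that captions
are keyed by the index of the frame they accompany, and that decoding is a pass-through
placeholder; the function name reproduce is hypothetical.

```python
from typing import Dict, Iterator, List, Optional, Tuple


def reproduce(
    audio_frames: List[bytes], captions: Dict[int, str]
) -> Iterator[Tuple[Optional[bytes], str]]:
    """Yield (audio, caption) pairs following the branches of the flow chart."""
    if not captions:  # steps 32 and 34: audio information only
        for frame in audio_frames:
            yield frame, ""
        return
    if not audio_frames:  # steps 36 and 38: caption information only
        for number in sorted(captions):
            yield None, captions[number]
        return
    shown = ""  # caption currently on the display
    for number, frame in enumerate(audio_frames):
        decoded = frame  # step 40: placeholder for MP3 decoding
        if number in captions:  # step 42: is there a matching caption?
            shown = captions[number]  # steps 46 and 48: decode and output it
        # step 44: otherwise the already outputted caption stays on screen
        yield decoded, shown
```

Because the function is a generator, the caller pulls one synchronized pair at a time,
which mirrors the return to the first determining step after each output.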

For example, by the caption information (22) added to the audio information (20),
the sound "beau-" and the character "beau-" synchronized with the sound "beau-" are
outputted from the respective output apparatuses, i.e., the speaker from which the audio
information (20) is outputted and the display device from which the caption information
(22) is concurrently outputted. At this time, since each piece of audio information (20)
and caption information (22) has a capacity that may hold only a part of the audio
signal and the caption signal, such as "beau-," a plurality of pieces of audio information
(20) and corresponding caption information (22) are required to store a word such as
"beautiful".

In this manner, the audio information (20) stored in the caption MP3 recording 
medium and the caption information (22) synchronized to correspond to the audio 
information are simultaneously reproduced as audio and caption through the respective 
output apparatuses. 

Second Embodiment

As shown in FIG. 4, an MP3 data format (50) according to a second embodiment
of the present invention comprises a plurality of caption MP3 files (50a, 50b,...). Each
caption MP3 file has audio information (52) and caption information (54). The audio
information (52) comprises standard MPEG audio files as in the first embodiment.

In FIG. 4, the caption information (54) comes after the audio information (52).
Alternatively, the caption information (54) could come before the audio information
(52).

However, considering compatibility with widely used software for MP3 file
reproduction, it is preferable to use the structure shown in FIG. 4.

In FIG. 4, one MP3 file (e.g., 50a) appears to comprise one audio information 
(52a) and one caption information (54a). However, that is just for simplification of the 
drawing. 

One skilled in the art will understand that, in most caption MP3
files, one file comprises a plurality of pieces of audio information and a plurality of pieces
of corresponding caption information. One MP3 file corresponds to one paragraph of
phonetic information that is divided, for example, by song or on a predetermined
basis (e.g., theme).

Each of the caption information 1 and 2 (54a and 54b) comprises a caption start
synchronization signal (56), caption data 1, 2,... N (58a, 58b,..., 58n), a text type (60) and
a caption identification code (62). The caption start synchronization signal (56), for
example, as 4-byte data, indicates the start position in which the caption information is
contained. The caption data (58) contains the character information to be displayed on the
actual screen, and its size changes according to the character information. The caption
data (58) will be explained hereinafter.

The text type (60) determines the type, i.e., the form of text output so that, for example,
the character information in the caption data (58) is outputted in a form of 20
columns and 4 lines or 24 columns and 2 lines. The caption identification code (62) is a
code that identifies whether the data format is the caption MP3 file. Each of the caption
data 1, 2,..., N (58a, 58b,..., 58n) is formed to include a caption display time (64), a
sentence start identification (66), a caption (68), additional data (70), an option (72) and
the size of data (74).

The caption display time (64), as the time data of the point in time at which the
caption is displayed, has a size of, for example, 7 bytes. The sentence start identification
(66) is a code to find the beginning of the sentence when the sentence is displayed on
different screens.

The caption (68) is character data to be displayed on the screen, and the additional
data (70) is used when information other than the option is required and contains
information that shows the form of the caption information file (image file, hypertext file,
text file) or indicates the language of the caption information. The option (72) is a storage
place for optional matters and stores information such as a scroll form (e.g.,
the character is displayed flowing across the screen, or a previous sentence disappears
slowly and the next sentence appears slowly in the same place) or a flash (e.g., flickering
of letters). The data size (74) holds information on the length of the caption data.
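
The structure of the caption information (54) and the caption data (58) can be pictured
with the following sketch. The class names, the representation of the display time as an
integer number of milliseconds, the placeholder identification value b"CAP3", and the
computation of the data size as the encoded length of the caption are all assumptions for
illustration; the specification gives only example field sizes and no byte-level encoding.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CaptionData:  # hypothetical in-memory form of caption data (58)
    display_time_ms: int      # (64) point in time at which the caption is shown
    sentence_start: bool      # (66) marks the beginning of a sentence
    caption: str              # (68) characters to be displayed
    additional: str = "text"  # (70) file form or language (assumed value)
    option: str = ""          # (72) e.g. "scroll" or "flash" (assumed values)

    @property
    def data_size(self) -> int:
        """(74) Length of the caption, here assumed to be its UTF-8 byte count."""
        return len(self.caption.encode("utf-8"))


@dataclass
class CaptionInformation:  # hypothetical in-memory form of caption information (54)
    text_type: Tuple[int, int]  # (60) columns and lines, e.g. (20, 4)
    entries: List[CaptionData] = field(default_factory=list)
    identification_code: bytes = b"CAP3"  # (62) placeholder value, not from the source

    def in_display_order(self) -> List[CaptionData]:
        """Return the caption data sorted by caption display time."""
        return sorted(self.entries, key=lambda entry: entry.display_time_ms)
```

Keeping the entries sortable by display time is convenient for time-based
synchronization, since the player then only needs to look at the next pending entry.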

In the caption MP3 file of such an information format, the audio information and
the caption information are synchronized with each other by the caption display time (64)
data.

The synchronization by the use of time information is more advantageous than 
that by the use of the position data and has a lower rate of failure. 

In the meantime, as described above, the standard MP3 file has a format
comprising a header, a CRC, audio data and additional data. It can be considered that the
caption information is included in the additional data of the standard MP3 file. However,
since the compression rate varies depending on the amount of the audio data, it is
easier to add the caption information as a separate file format as in the first and
second embodiments according to the present invention than to include the caption
information in the additional data. Further, there are many advantages such as the
prevention of noise caused by interference between the caption information and the
audio information.

FIG. 5 is a flow chart showing the method of reproducing the caption information 
in the caption MP3 file according to the second embodiment of the present invention. 

The file stored in the recording medium is opened, and the caption identification
code (62) of the caption information (54) is used to identify whether caption
information is present in the file to be reproduced. If the caption information is included
in the file, it is stored in the storage section (12) of the caption MP3 player and then
reproduced. When the reproduction of the caption information starts, the caption
information is initialized (step 80). At the step of initializing the caption information
(80), it is determined whether to delete the caption displayed on the display device and to
which file the caption data is attached. If reproduction is to end, it terminates at the
termination step (85). Otherwise, the playing time, which is counted in units of 1/1000
sec, is obtained (step 86). This playing time has a value counted by the reproduction
apparatus, for example, the control section (10) of the caption MP3 player shown in
FIG. 1. The playing time is compared with the caption time (step 88). The caption time,
as a caption display time (64a) of the caption data (58), is the time data value of the
point in time at which the caption is displayed. The next caption time data is obtained
(step 90) and then the caption information is outputted (step 92). After returning to the
termination-of-reproduction determining step 84 (step 94), steps 86-92 are repeated.
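
The comparison of the playing time with the caption time can be sketched as a simple
loop. This is a minimal illustration that assumes the captions are available as
(display time in milliseconds, text) pairs sorted by time and that the playing time is
sampled at discrete ticks; the function name run_captions is hypothetical, and the
initialization and termination steps are reduced to the start and end of the loop.

```python
from typing import Iterable, List, Tuple


def run_captions(
    schedule: List[Tuple[int, str]], ticks: Iterable[int]
) -> List[Tuple[int, str]]:
    """Return (playing time, caption) pairs in the order they would be displayed."""
    shown: List[Tuple[int, str]] = []
    index = 0  # step 80: start from the first caption
    for playing_time in ticks:  # step 86: obtain the playing time in ms
        # step 88: compare the playing time with the pending caption time
        while index < len(schedule) and playing_time >= schedule[index][0]:
            text = schedule[index][1]
            index += 1  # step 90: move on to the next caption time data
            shown.append((playing_time, text))  # step 92: output the caption
    return shown  # the loop ends when playback ends


# Example: two captions, playing time sampled every 500 ms.
events = run_captions([(0, "Hello"), (1500, "world")], range(0, 3000, 500))
```

The inner loop uses `while` rather than `if` so that captions whose display times fall
between two samples of the playing time are not skipped.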

Although the present invention has been described with reference to the drawings,
it is understood that this description is not to limit the invention to the embodiments
shown in the drawings but simply to explain the invention. One skilled in the art will
understand that various changes and modifications can be made from the embodiments
disclosed in the specification. Therefore, the scope of the present invention should be
defined by the appended claims.